MSc Bioinformatics

QA and Assembly

bmpvieira.com/assembly14

bmpvieira

Bruno Vieira | @bmpvieira

PhD Student @ QMUL

Bioinformatics and Population Genomics

Supervisor:
Yannick Wurm | @yannick__

© 2014 Bruno Vieira CC-BY 4.0


Download data

bit.ly/ant-reads

Useful books

Papers

De novo genome assembly: what every biologist should know

Assemblathon 2: evaluating de novo methods of genome assembly[...]


Genome Assembly


Chen 2011


Types

Algorithms

Strategies


Assembly paradigms


Overlap/Layout/Consensus


Chen 2011
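
A minimal sketch of the overlap and layout steps, assuming error-free reads on a single strand. The function names, the `min_len` threshold and the greedy merge order are illustrative assumptions, not the method of any particular assembler, and the consensus step is not modelled.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that is a prefix of b (0 if below min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-n:] == b[:n]:
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best = (0, None, None)
        # Overlap step: compare every ordered pair of reads.
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            break  # no remaining overlaps: what is left are separate contigs
        # Layout step: join the two reads, keeping the shared bases once.
        merged = reads[i] + reads[j][n:]
        reads = [r for idx, r in enumerate(reads) if idx not in (i, j)] + [merged]
    return reads


contigs = greedy_assemble(["ATGGCGT", "GCGTACG", "TACGTTA"])
print(contigs)
```

The all-against-all comparison is quadratic in the number of reads, which is one reason this paradigm is usually associated with fewer, longer reads.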


de Bruijn

Chen 2011
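
A minimal sketch of building a de Bruijn graph from reads: every k-mer becomes an edge from its first k-1 bases to its last k-1 bases. The function name and the tiny `k` are illustrative assumptions; real assemblers also handle reverse complements, sequencing errors and graph simplification, none of which is shown here.

```python
from collections import defaultdict


def de_bruijn(reads, k):
    """Map each (k-1)-mer to the list of (k-1)-mers that follow it in some k-mer."""
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Edge from the k-mer's prefix node to its suffix node.
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


print(de_bruijn(["ATGGC", "GGCGT"], k=3))
```

Reads are never compared with each other directly; overlaps are implicit in shared (k-1)-mers, which is why this approach scales to very large numbers of short reads.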


Schatz 2012



Too many assemblers

seqanswers.com/wiki/De-novo_assembly


A5, ABySS, ALLPATHS, CABOG, CLCbio, Contrail, Curtain, DecGPU, Forge, Geneious, GenoMiner, IDBA, Lasergene, MIRA, Newbler, PE-Assembler, QSRA, Ray, SeqMan NGen, SeqPrep, Sequencher, SHARCGS, SHORTY, SHRAP, SOAPdenovo, SR-ASM, SuccinctAssembly, SUTTA, Taipan, VCAKE, Velvet


Benchmarking


Why we need the Assemblathon


Assembly quality assessment



N50 must die?
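
For reference, N50 is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. A minimal sketch (the function name and example lengths are illustrative):

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list)."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0


print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
print(n50([100, 1, 1, 1]))
```

The second call hints at why the metric is criticised: it says nothing about correctness, and a single long (possibly misassembled) contig dominates the value.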




FastQC

FastQC Documentation




Diginorm

"(...)systematizes coverage in shotgun sequencing data sets, thereby decreasing sampling variation, discarding redundant data, and removing the majority of errors."


Diginorm

"(...)reduces the size of shotgun data sets and decreases the memory and time requirements for de novo sequence assembly, all without significantly impacting content of the generated contigs."

Magic? No, Bloom filters
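
A minimal sketch of the idea behind digital normalization: a read is kept only if the median abundance of its k-mers, counted over the reads kept so far, is still below a coverage cutoff. This sketch swaps in an exact `Counter` for the memory-efficient probabilistic counting structure that makes the real tool practical; the function names and the small `k` and `cutoff` values are illustrative assumptions.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """All overlapping k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def normalize(reads, k=4, cutoff=2):
    """Keep reads until the region they come from is already covered about cutoff times."""
    counts = Counter()  # exact stand-in for a probabilistic k-mer counter
    kept = []
    for read in reads:
        ks = kmers(read, k)
        if not ks:
            continue  # reads shorter than k are dropped in this sketch
        if median(counts[x] for x in ks) < cutoff:
            counts.update(ks)  # only kept reads contribute to the counts
            kept.append(read)
    return kept


reads = ["ACGTACGT"] * 5 + ["TTTTGGGG"]
print(normalize(reads))
```

Redundant copies of the high-coverage read are discarded while the low-coverage read survives, which is the "systematizes coverage" effect in the quote above.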


Diginorm

What is digital normalization, anyway?

Why you shouldn't use digital normalization


Fasta
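
A FASTA record is a header line starting with `>` followed by one or more sequence lines. A minimal parsing sketch (the function name and example records are illustrative; for real work a library such as Biopython is the usual choice):

```python
def parse_fasta(lines):
    """Yield (header, sequence) tuples from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        else:
            # Sequences may be wrapped over several lines; join them later.
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


example = ">seq1 example\nACGT\nTTGA\n>seq2\nGG\n"
for header, seq in parse_fasta(example.splitlines()):
    print(header, len(seq))
```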


Fastq
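
A FASTQ record has four lines: `@` header, sequence, `+` separator, and one quality character per base. A minimal parsing sketch, assuming unwrapped four-line records and Phred+33 quality encoding (both assumptions; older data may use a different offset). The function name and example read are illustrative.

```python
def parse_fastq(lines, offset=33):
    """Yield (name, sequence, phred_scores) from an iterable of FASTQ lines."""
    rows = [line.rstrip("\n") for line in lines if line.strip()]
    if len(rows) % 4:
        raise ValueError("FASTQ records must have exactly four lines")
    for i in range(0, len(rows), 4):
        header, seq, plus, qual = rows[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record starting at line {i + 1}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header}")
        # Each quality character encodes one Phred score as ASCII code minus offset.
        yield header[1:], seq, [ord(c) - offset for c in qual]


example = "@read1\nACGT\n+\nII5!\n"
for name, seq, scores in parse_fastq(example.splitlines()):
    print(name, seq, scores)
```

A Phred score Q corresponds to an error probability of 10^(-Q/10), so Q20 means a 1 in 100 chance that the base call is wrong.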




Interleaved format
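
In an interleaved file, the two reads of each pair sit next to each other in a single file instead of in two parallel files. A minimal sketch of interleaving two paired FASTQ files held as lists of lines, assuming four-line records in the same order in both inputs (the function name and the `/1`, `/2` read names are illustrative):

```python
def interleave(r1_lines, r2_lines):
    """Alternate four-line FASTQ records from the forward and reverse files."""
    if len(r1_lines) != len(r2_lines) or len(r1_lines) % 4:
        raise ValueError("inputs must hold the same number of four-line records")
    out = []
    for i in range(0, len(r1_lines), 4):
        out.extend(r1_lines[i:i + 4])  # forward read of the pair
        out.extend(r2_lines[i:i + 4])  # its mate from the reverse file
    return out


r1 = ["@pair1/1", "ACGT", "+", "IIII", "@pair2/1", "GGCC", "+", "IIII"]
r2 = ["@pair1/2", "TTAA", "+", "IIII", "@pair2/2", "CCGG", "+", "IIII"]
print("\n".join(interleave(r1, r2)))
```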


Practical

bmpvieira.com/assembly14-practical



Copyright Authors. All rights reserved.